Configuration
linkExtractor
Function for extracting URLs for links found on crawled pages.
By default, Algolia queues all URLs that comply with the pathsToMatch and fileTypesToMatch actions, and the exclusionPatterns parameter.
You can override this default logic by providing a linkExtractor function that returns its own list of URLs to queue.
Parameters
A Cheerio instance with the HTML of the crawled page. For more information, see Extracting data with Cheerio.
The Crawler’s default URL discovery function. It returns an array of strings, each representing a URL on the page that matches the crawler’s configuration.
URL of the page that was just crawled.
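As a rough illustration of how the Cheerio instance might be used, the following sketch collects links from the navigation area only. The parameter names `$` and `url`, the assumption that `url` is a `URL` object, the `nav a[href]` selector, and the choice to resolve relative links to absolute URLs are all assumptions made for this example and are not confirmed by this page.

```javascript
// A minimal sketch, assuming linkExtractor receives one object whose
// properties include `$` (Cheerio instance) and `url` (a URL object).
// Both names and the URL-object type are assumptions for illustration.
const navLinkExtractor = ({ $, url }) => {
  const links = [];

  // Hypothetical rule: only follow links found inside <nav> elements.
  $('nav a[href]').each((_, element) => {
    const href = $(element).attr('href');
    try {
      // Resolve relative links against the crawled page's URL.
      links.push(new URL(href, url.href).href);
    } catch {
      // Skip values that can't be parsed as a URL.
    }
  });

  // Remove duplicates before returning the list of URLs to queue.
  return [...new Set(links)];
};
```

Because this function never calls the default discovery function, it replaces the default logic entirely instead of refining it.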
Examples
JavaScript
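The original code samples are missing from this page, so the following is only a sketch of what an override could look like. It assumes the function receives a single object with `$`, `url`, and `defaultExtractor` properties, and that `url` is a `URL` object; these names, the `/doc/` path rule, and the `config` object are hypothetical.

```javascript
// A minimal sketch: limit link discovery on some pages and use the
// default discovery function everywhere else. Parameter names are assumptions.
const linkExtractor = ({ $, url, defaultExtractor }) => {
  // Hypothetical rule: on pages under /doc/, queue only the first link found.
  if (url.pathname.startsWith('/doc/')) {
    return defaultExtractor().slice(0, 1);
  }

  // For all other pages, keep the default behavior.
  return defaultExtractor();
};

// Hypothetical fragment showing where the function would sit in a
// crawler configuration object.
const config = {
  linkExtractor,
};
```

Calling the default discovery function and then trimming or filtering its result keeps the pathsToMatch, fileTypesToMatch, and exclusionPatterns rules in effect, which is usually simpler than re-implementing them.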